NSF PAR Search | NSF Public Access Repository


Topological Risk-Landscape in Metric-Free Categorical Database

https://doi.org/10.1109/ACCESS.2024.3398416

Fushing, Hsieh; Kao, Hong-Wei; Chou, Elizabeth P (January 2024, IEEE Access)

Full Text Available
Unraveling Hidden Major Factors by Breaking Heterogeneity into Homogeneous Parts within Many-System Problems

https://doi.org/10.3390/e24020170

Chou, Elizabeth P.; Chen, Ting-Li; Fushing, Hsieh (February 2022, Entropy)

For a large ensemble of complex systems, a Many-System Problem (MSP) studies how heterogeneity constrains and hides structural mechanisms, and how to uncover and reveal hidden major factors from homogeneous parts. All member systems in an MSP share common governing principles of dynamics, but differ in idiosyncratic characteristics. A typical dynamic is found underlying response features with respect to covariate features of quantitative or qualitative data types. Neither all-system-as-one-whole nor individual system-specific functional structures are assumed in such response-vs-covariate (Re–Co) dynamics. We developed a computational protocol for identifying various collections of major factors of various orders underlying Re–Co dynamics. We first demonstrate the immanent effects of heterogeneity among member systems, which constrain compositions of major factors and even hide essential ones. Secondly, we show that fuller collections of major factors are discovered by breaking heterogeneity into many homogeneous parts. This process further realizes Anderson’s “More is Different” phenomenon. We employ the categorical nature of all features and develop a Categorical Exploratory Data Analysis (CEDA)-based major factor selection protocol. Information theoretical measurements—conditional mutual information and entropy—are heavily used in two selection criteria: C1—confirmable and C2—irreplaceable. All conditional entropies are evaluated through contingency tables with algorithmically computed reliability against the finite sample phenomenon. We study one artificially designed MSP and then two real collectives of Major League Baseball (MLB) pitching dynamics with 62 slider pitchers and 199 fastball pitchers, respectively. Finally, our MSP data analyzing techniques are applied to resolve a scientific issue related to the Rosenberg Self-Esteem Scale.
Full Text Available
Categorical Nature of Major Factor Selection via Information Theoretic Measurements

https://doi.org/10.3390/e23121684

Chen, Ting-Li; Chou, Elizabeth P.; Fushing, Hsieh (December 2021, Entropy)

Without assuming any functional or distributional structure, we select collections of major factors embedded within response-versus-covariate (Re-Co) dynamics via selection criteria [C1: confirmable] and [C2: irreplaceable], which are based on information theoretic measurements. The two criteria are constructed based on the computing paradigm called Categorical Exploratory Data Analysis (CEDA) and linked to Wiener–Granger causality. All the information theoretical measurements, including conditional mutual information and entropy, are evaluated through the contingency table platform, which primarily rests on the categorical nature within all involved features of any data types: quantitative or qualitative. Our selection task identifies one chief collection, together with several secondary collections of major factors of various orders underlying the targeted Re-Co dynamics. Each selected collection is checked with algorithmically computed reliability against the finite sample phenomenon, and so is each member’s major factor individually. The developments of our selection protocol are illustrated in detail through two experimental examples: a simple one and a complex one. We then apply this protocol to two data sets pertaining to two somewhat related but distinct pitching dynamics of two pitch types: slider and fastball. In particular, we refer to a specific Major League Baseball (MLB) pitcher and we consider data from multiple seasons.
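The contingency-table evaluation of conditional entropy mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' CEDA protocol: the function names `entropy` and `conditional_entropy` are hypothetical, the features are assumed to be already categorized, and neither the reliability check nor the criteria C1 and C2 are implemented.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of categorical labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def conditional_entropy(response, covariate):
    """H(response | covariate), computed from the contingency table
    of the two label lists (rows: covariate, columns: response)."""
    n = len(response)
    table = Counter(zip(covariate, response))  # contingency table cells
    row_totals = Counter(covariate)            # covariate margins
    h = 0.0
    for (x, _y), count in table.items():
        # joint weight count/n times log of the within-row proportion
        h -= (count / n) * log2(count / row_totals[x])
    return h


# Hypothetical toy data: one response and two candidate covariates.
response = ["a", "a", "b", "b"]
informative = ["u", "u", "v", "v"]    # determines the response exactly
uninformative = ["u", "v", "u", "v"]  # independent of the response

print(entropy(response))
print(conditional_entropy(response, informative))
print(conditional_entropy(response, uninformative))
```

A drop from the entropy of the response to its conditional entropy given a covariate indicates that the covariate carries information about the response; no drop indicates that it carries none in this sample.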
Full Text Available
Unraveling the Regional Specificities of Malbec Wines from Mendoza, Argentina, and from Northern California

https://doi.org/10.3390/agronomy9050234

Fushing, Hsieh; Lee, Olivia; Heitkamp, Constantin; Heymann, Hildegarde; Ebeler, Susan E.; Boulton, Roger B.; Koehl, Patrice (May 2019, Agronomy)

This study explores the relationships between chemical and sensory characteristics of wines in connection with their regions of production. The objective is to identify whether such characteristics are significant enough to serve as signatures of a terroir for wines, thereby supporting the concept of regionality. We argue that the relationships between characteristics and regions of production for the set of wines under study are rendered complicated by possible non-linear relationships between the characteristics themselves. Consequently, we propose a new approach for performing the analysis of the wine data that relies on these relationships instead of trying to circumvent them. This new approach follows two steps. We first cluster the measurements for each characteristic (chemical or sensory) independently. We then assign a distance between two features to be the mutual entropy of the clustering results they generate. The set of characteristics is then clustered using this distance measure. The result of this clustering is a set of sub-groups of characteristics, such that two characteristics in the same group carry similar, i.e., synergetic information with respect to the wines under study. Those wines are then analyzed separately on the different sub-groups of features. We have used this method to analyze the similarities and differences between Malbec wines from Argentina and California, as well as the similarities and differences between sub-regions of those two main wine-producing regions. We report detection of groups of features that characterize the origins of the different wines included in the study. We note stronger evidence of regionality for Argentinian Malbec wines than for Californian wines, at least for the sub-regions of production included in this study.
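The two-step idea in this abstract (cluster each characteristic on its own, then compare characteristics through the clusterings they produce) can be sketched as follows. Several parts are assumptions and not taken from the paper: equal-count binning stands in for the per-characteristic clustering, the abstract's "mutual entropy" is interpreted here as a distance derived from mutual information (the variation of information), and all function names and data values are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of categorical labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """I(A;B) = H(A) + H(B) - H(A,B) for two label lists."""
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))


def clustering_distance(a, b):
    """Variation of information, H(A|B) + H(B|A); an assumed choice.
    It is 0 when the two clusterings coincide up to relabeling."""
    return entropy(a) + entropy(b) - 2 * mutual_information(a, b)


def bin_feature(values, n_bins=2):
    """Assumed stand-in for clustering one characteristic:
    equal-count bins assigned by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * n_bins // len(values)
    return labels


# Hypothetical measurements of three characteristics on four wines.
f1 = bin_feature([1.0, 2.0, 3.0, 4.0])
f2 = bin_feature([10.0, 20.0, 30.0, 40.0])  # same ordering as f1
f3 = bin_feature([1.0, 4.0, 2.0, 3.0])      # different ordering

print(clustering_distance(f1, f2))
print(clustering_distance(f1, f3))
```

The resulting pairwise distances between characteristics could then be passed to any clustering routine that accepts a precomputed distance matrix in order to form the sub-groups of characteristics.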
Full Text Available